EFFECT OF SPATIAL AND TEMPORAL VIDEO IMAGE COMPRESSION ON MILITARY TARGET DETECTION


I.  INTRODUCTION 

In  recent  years  there  has  been  increasing  interest  in  visual  information  processing. 
That  interest  has  been  brought  about,  in  part,  by  advances  in  communication  satellite 
technology,  computer  technology,  military  systems,  and  educational  systems.  Whether 
a  given  application  is  in  national  defense,  general  communications,  or  education,  major 
system  trade-offs  must  be  evaluated  at  the  interface  between  hardware/display  and  the 
user/observer. In general, the objective is to provide a level of display fidelity
appropriate to the observer’s capabilities and task requirements while considering the
operational environment of the overall system (e.g., military combat versus civilian
peacetime). In systems involving video images (television monitors), image compression
and frame rate are two prominent variables from the standpoint of system
cost-effectiveness and observer performance reliability. The digital compression of
image information has at least two major benefits of widespread application, one in the
military and the other in nonmilitary systems. The primary military advantage is that the
associated Radio Frequency (RF) transmissions are less susceptible to deliberate
interference (i.e., “jamming”). The nonmilitary benefit is that image compression allows
more information to be transmitted faster, thus saving not only time but energy as well.

In terms of a simple analogy, image compression may be described by starting with a
scene or an object represented in a two-dimensional plane (e.g., as a photograph or a
video display). An extremely fine grid is placed over the scene. To represent the scene,
each square or cell of the grid is assigned a value on a scale with values corresponding
to shades of contrast from low to high. The finer the grid and the greater the number of
gradations on the light-to-dark scale, the higher the fidelity of the display. To
transmit the information thus represented, the specific row and column designations
and the light-dark values assigned to each cell would have to be sent. As the
requirement for display fidelity increases, more time and energy are needed for
transmission.

When a human observer uses the display to detect and classify objects (targets),
the required level of fidelity depends upon several factors. Among these are the
relative size and shape of the targets in the scene, the similarity of the figures
designated as targets to other objects (non-targets), and the level and distribution of
intensity contrasts in the scene. The level of fidelity, which includes the degree of
visual detail and resolution needed, is dependent on the observer’s visual and
perceptual capability to detect, recognize, and classify targets at specified levels of
reliability.

Given the grid format as discussed above, image compression is a technique of
representing the scene with a smaller number of cells and light-dark gradations. The
compression or transformation is accomplished by replacing groups of cells with a
single cell representing the average value. The resulting grid is composed of fewer
cells or subdivisions, with the gradations across cells less gradual than before.
Displays subjected to this type of image compression are seen as “blocky,” as though
the scene were depicted by the arrangement of tiny rectangles with a limited range of
contrast. Where the target is a rectilinear form of uniform contrast (e.g., an office
building), a moderate amount of image compression does not seriously degrade target
visibility. Where the target is of irregular shape, especially with curved lines in its
borders, a modest amount of image compression may render the target virtually
unrecognizable. This is more likely where the target is small relative to the elements
of the display grid.
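
The grid analogy above can be sketched in a few lines of Python. This is only an
illustration of "replacing groups of cells with a single cell representing the average
value"; the function name `block_average`, the 4 x 4 sample scene, and the block size
are assumptions made for the example and are not taken from the study.

```python
def block_average(image, block):
    """Replace each block x block group of cells with its average value."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            # Collect the cells in this group (edge groups may be smaller).
            cells = [image[i][j]
                     for i in range(r, min(r + block, rows))
                     for j in range(c, min(c + block, cols))]
            out_row.append(sum(cells) / len(cells))
        out.append(out_row)
    return out


# Hypothetical 4 x 4 scene: three uniform regions and one finely patterned region.
scene = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 6],
    [4, 4, 6, 2],
]
coarse = block_average(scene, 2)
print(coarse)  # [[0.0, 8.0], [4.0, 4.0]]
```

In this toy case the uniform regions survive averaging unchanged, while the patterned
lower-right region collapses to the same value as its uniform neighbor, which is the
kind of loss the text describes for small or irregular targets.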

To  the  foregoing  considerations  one  may  add  the  dimension  of  motion.  This 
occurs  whenever  the  camera  (sensor  system)  or  the  target  is  in  motion  relative  to  the 
other.  Whereas  image  fidelity  of  the  grid  representation  is  similar  to  having  many  tiny 
picture  elements  presented  contiguously  and  simultaneously,  representation  of  relative 
motion  is  analogous  to  creating  and  displaying  many  such  grids,  one  after  the  other. 
A  similar  principle  is  that  used  by  the  common  motion  picture  camera.  If  the  series  of 
grids  (frames)  is  photographed  rapidly  at  fixed  intervals  and  played  back  at  the  same 
rate,  the  relative  motion  will  be  displayed  with  high  fidelity.  If  the  frames  are  not 
taken at a high enough rate (which depends on the rate of movement being
photographed), the subject in the playback will appear to move in a stepwise fashion rather
than  continuously. 

The frame rate needed is determined in much the same way as the degree of fidelity
required in each frame or grid representation. That is, the frame rate depends on the
needed degree of reliability, which is a function of the scene viewed and the targets
to be detected and recognized. And, as with the fidelity of each frame, the more frames
per unit time required, the more time and energy needed for their transmission.

Reducing the display fidelity may result in deterioration of the observer’s
performance. The degree of degradation depends on the target detection and recognition
requirements placed on the human observer. Image compression and frame rate
reduction represent a decrease in display fidelity. Furthermore, image compression used
in conjunction with reduced frame rates may result in degradation of observer
performance greater than the sum of the effects of each variable alone (i.e., an
interaction effect of image compression and frame rate on target detection and
recognition).

a. Purpose. The present study was designed to investigate:

(1) The effect of image compression on target detection and recognition.

(2) The effect of frame rate on target detection and recognition.

(3) The effect of a possible image compression by frame rate interaction on target
detection and recognition.

In this study, the terms “detection” and “recognition” refer to two successive levels of
visual discrimination. “Detection” represents the first level of judgment by the
observer: that an object in the scene is a member of the class of objects designated as
targets. For example, the observer may report a “detection” on the basis of an object’s
outline, contrast, shadow pattern, and so on. The discrimination at that time may be
simply between man-made and natural objects, or it may be to the level of a general
category of man-made objects such as buildings, roads, or vehicles.

The second stage, “recognition,” is when the observer specifies the detected target to
a greater degree; i.e., having “detected” a target, he now “recognizes” it as a “jeep.”
By this definition, recognition presupposes detection.


b. Experimental Context. As a basic research issue there are many variables and
interactions to be evaluated in image processing. Tests performed with simplified
extractions of the independent variable (e.g., geometric shapes, patterns, etc.) would
be useful in identifying the effects on the dependent variables (target detection and
recognition). As an issue of applied research, however, it is possible to evaluate
directly the influence of the display parameters on observer performance. And, taking
the applied approach, the results would be more relevant to the operational system
evaluated, with the data also providing insight into the basic display-observer
interface. The latter approach was taken in the present study. In this investigation,
observers viewed video monitors that showed prerecorded scenes of aerial
reconnaissance flights. The task of the observers was to search the scene for the
presence (detection) of a vehicle and then to identify (recognition) the vehicle by type
(i.e., military jeep or truck of a given size). The operational system associated with
this task is described below.


c. Remotely Piloted Vehicle (RPV) System. The function of the remotely piloted vehicle
(RPV) system is to obtain information by means of aerial reconnaissance. In the RPV
system, the airborne vehicle carries a video camera and flies a pre-established,
computer-controlled course. From the aircraft, two types of information are transmitted
back to its base and received by two persons, each having a different function. One
type of information is that which permits the course and location of the aircraft to be
monitored while in flight. This monitoring is done by the first of the two persons
mentioned. The monitor, or “controller,” as is the common designation, does not fly the
aircraft by remote control in a continuous mode. Rather, the flight path program is
updated periodically by use of a standard keyboard input to the computer. The
controller’s display is a standard Cathode Ray Tube (CRT) radar-type display upon which
the remotely piloted vehicle is represented by a moving point of light. (The controller
does not see the output of the airborne video camera.) The second type of information,
a direct video scene of the ground forward of the aircraft, is monitored by the second
person, designated as the “observer.” The observer searches the scene for prescribed
targets. When a target is detected or suspected to be present, its apparent location is
communicated to the controller. The controller updates the computer-controlled
flight path to put the aircraft on course to the suspected location of the target, thus
providing the observer with an improved view of the area in question.

A  typical  RPV  system  is  designed  to  transmit  analog  information  back 
to  its  base  through  a  wide-band  RF  signal.  Such  signals,  however,  may  be  disrupted 
readily  (“jammed”)  by  electronic  countermeasures.  When  this  happens,  not  only  is  the 
video  reconnaissance  information  lost  but  control  of  the  vehicle  is  jeopardized  as  well. 
This  susceptibility  to  jamming  can  be  alleviated  by  reducing  the  bandwidth  of  the  RF 
signal.  To  do  so,  however,  requires  alteration  of  the  temporal  or  spatial  characteristics 
of  the  video  information  or  both,  depending  on  the  reduction  process  used. 

The temporal characteristic of the information is essentially synonymous with frame
rate as described earlier. The procedure is to convert the video information from
analog (primary image rate is 30 frames per second) to digital form prior to
transmission. Once converted, the total set of frames (a second, minute, hour, or any
desired unit of time) may be sampled and just the sample transmitted. By selecting and
transmitting every other frame, for example, the view would be temporally reduced or
“compressed” by 50 percent. Correspondingly, to provide the video observer with a
continuous display, each frame would be presented for twice its normal duration. For
the aerial reconnaissance scenes used in the present study, a reduction from 30 frames
per second (analog rate) to 20 frames per second does not markedly disrupt the apparent
smoothness of the view over time. At 10 frames per second the display is described
subjectively by observers as “slightly choppy.” At lower frame rates there is an obvious
stepwise change in the scene from one frame to the next. The effect of such reductions
in frame rate on the observer’s ability to detect and recognize military targets was one
of the main interests of this investigation.
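
The every-other-frame example can be sketched as follows. This is a minimal
illustration, assuming simple integer subsampling with each transmitted frame held on
the display; the function name `subsample_and_hold` and the frame numbering are
hypothetical, and the source does not state that the study tapes were produced exactly
this way.

```python
def subsample_and_hold(frames, keep_every):
    """Transmit every keep_every-th frame and hold each one on the display
    for keep_every frame periods so that playback remains continuous."""
    transmitted = frames[::keep_every]
    displayed = []
    for frame in transmitted:
        displayed.extend([frame] * keep_every)
    # Trim in case the frame count is not a multiple of keep_every.
    return transmitted, displayed[:len(frames)]


frames = list(range(30))            # one second of 30 frame/s video, numbered 0..29
sent, shown = subsample_and_hold(frames, 2)
reduction = 1 - len(sent) / len(frames)

print(len(sent))     # 15 frames transmitted per second
print(shown[:6])     # [0, 0, 2, 2, 4, 4]
print(reduction)     # 0.5
```

Under the same assumption, keeping every 3rd, 5th, 10th, or 30th frame of a 30 frame
per second source gives 10, 6, 3, and 1 frames per second, respectively.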

In addition to altering the temporal characteristic of the video information to be
transmitted, it is possible, as discussed earlier, to vary or “compress” its spatial
characteristic as well. The technique is a complex averaging process applied to the
many elements comprising the video scene. These elements are dots and lines which vary
in contrast to represent the visual scene and which may be specified quantitatively as
“bits of information.” In the process of compression, the number of bits is reduced by
replacing details with an average value.

The effect on the video scene of increasing levels of spatial compression is
illustrated in Figure 1. The views shown are: (a) analog (no compression),
(b) reduction to 8 bits per picture element (“pixel”), (c) reduction to 2 bits per
picture element, and (d) reduction to 0.5 bit per picture element. In the scene shown,
the target is a 2½-ton truck located slightly below the center of the picture. As may
be seen, as compression is increased (bits per pixel are reduced), visual details and
gradations of shading are decreased and the image assumes an appearance typically
described as “blocky,” owing to the black, white, and gray squares that comprise it.

A question of major interest addressed by the present investigation was the extent to
which image compression affects observer detection and recognition of military targets
(vehicles). Previous studies1,2 have examined the effect of image compression on target
detection/recognition, but none has evaluated the possible interaction effects of
simultaneous reduction in temporal and spatial information; i.e., reduction in frame
rate and bits per pixel. The present study was designed to address that issue.


II.  METHOD 

The  video  tapes  used  in  this  study  were  taken  from  a  set  of  actual  flight  test 
recordings  of  an  RPV  system.  The  tapes  were  reprocessed  to  provide  the  desired  frame 
rate  and  image  compression  levels. 

a. Selection and Processing of Video Reconnaissance Tapes. During a preliminary field
evaluation of the RPV system at Fort Huachuca, Arizona, approximately 250 video tapes
of aerial reconnaissance trials were produced. Each tape contained 10 “runs,” with a
run comprised of a single, straight and level flight segment oriented so as to fly
directly toward and over a specified target on the ground. The video camera was carried
by a Cessna Super Skymaster aircraft equipped with a short-takeoff-and-landing (STOL)
system. All runs were made at a constant altitude of 1,500 feet with the camera looking
forward and downward with respect to the aircraft’s trajectory. Each run was
photographed twice: once with a narrow (6.5°) field-of-view lens and once with a
wide-angle (16°) lens. All runs were made at the same altitude, heading, and ground
speed. All runs were over the same section of terrain. The only conditions to vary from
one run to the next were the type and location of target and the field-of-view (6.5°
versus 16°). All of the original tapes were analog recordings.

1 M. L. Hershberger and R. J. Vandervalk, Video Image Bandwidth Reduction/Compression
Studies for Remotely Piloted Vehicles, Tech Rpt. ASD-76, Hughes Aircraft Company,
Culver City, CA (October 1976).

2 H. C. Self and S. A. Heckart, TV Target Acquisition at Various Frame Rates, Tech Rpt.
AMRL-TR-73-111, Aerospace Medical Research Laboratory, Wright-Patterson AFB, Ohio
(September 1973).

Figure 1. Example of the effect on the video scene due to increasing the levels of
spatial compression: (a) no compression; (b) 8 bits per pixel; (c) 2 bits per pixel;
(d) 0.5 bit per pixel.

b.  Visual  Background.  The  ease  or  difficulty  with  which  visual  targets  can 
be  detected  depends  in  part  upon  the  background  against  which  they  appear.  The 
present  target  surround  consisted  of  light  colored  sand  populated  by  clusters  of  scrub 
pine  and  other  indigenous  flora.  The  effect  was  a  mottled  background  (see  Figure  1) 
that  provided  substantial  visual  noise  or  competition  in  searching  for  targets. 

All 250 tapes were reviewed for possible use in the present study, and a set of runs
was selected which represented the following basic conditions: (a) four types of target
vehicle (jeep; 5/4-ton truck; 2½-ton truck; armored personnel carrier (APC));
(b) reconnaissance runs (tape) for each target (varying location) photographed
variously with wide and narrow fields-of-view. This basic material was all in analog
format. On each run there was one, and only one, target.

From the foregoing tapes a master analog copy and a series of modified tapes were
prepared. The modification was a conversion and image compression process which
produced the variations in spatial compression (bits per pixel) and frame rate
necessary to conduct the study.

The spatial compression was done by the Northrop Corporation using a Haar
transformation.3 Once transformed, the tape frames were sampled systematically and
reproduced at the Night Vision and Electro-Optics Laboratory to give the total set of
experimental conditions (i.e., tapes with the required combination of spatial
compression and frame rate). There were four targets in all, each represented in the
different spatial-temporal conditions. Each tape contained the same 10 target runs
with the order of runs randomized across tapes.
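
For readers unfamiliar with the Haar transformation, the sketch below shows one level
of the textbook (unnormalized) one-dimensional Haar transform. It is a generic
illustration only: the source names the transformation but does not describe the
Northrop processing, so the function names, the sample row of values, and the choice
to discard all detail coefficients are assumptions made for the example.

```python
def haar_step(values):
    """One level of the unnormalized Haar transform on an even-length list:
    pairwise averages plus pairwise half-differences (details)."""
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def inverse_haar_step(averages, details):
    """Rebuild the original values from averages and details."""
    out = []
    for avg, det in zip(averages, details):
        out.extend([avg + det, avg - det])
    return out


row = [9, 7, 3, 5]                      # hypothetical brightness values
avg, det = haar_step(row)
exact = inverse_haar_step(avg, det)
# Compression by discarding detail coefficients: each pair becomes its average.
lossy = inverse_haar_step(avg, [0.0] * len(det))

print(avg, det)   # [8.0, 4.0] [1.0, -1.0]
print(exact)      # [9.0, 7.0, 3.0, 5.0]
print(lossy)      # [8.0, 8.0, 4.0, 4.0]
```

Keeping the averages and dropping the details reproduces the "replace details with an
average value" behavior described earlier, and is one way the blocky appearance can
arise.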

3 T. Leibhotf, H. Henning, T. Noda, and B. Deal, Final Report for Experimental
Development of a FLIR Sensor Processor, Technical Report 77Y106, Northrop Corporation,
Anaheim, CA (September 1977).

c. Experimental Design. A split-plot design was used with subjects nested within levels
of spatial compression (bits per pixel). Subjects were randomly assigned to their
respective groups. Within a given level of image compression each subject observed all
remaining combinations of frame rates, targets, and fields-of-view (wide versus
narrow). The spatial compressions were: 0.5, 2, and 8 bits per picture element as well
as analog. The frame rates were 1, 3, 6, 10, and 30 frames per second. A detailed
discussion of this experimental design may be found in Kirk.4
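
The layout of the design can be enumerated as a quick check. The sketch below lists
the compression-by-frame-rate cells; the figure of 48 subjects is the number reported
in the Subjects paragraph, and the equal split across the four compression groups is
an assumption, since the source says only that assignment was random.

```python
from itertools import product

compression_levels = ["0.5 bit", "2 bits", "8 bits", "analog"]   # between subjects
frame_rates = [1, 3, 6, 10, 30]                                   # within subjects
n_subjects = 48

# Every spatial-temporal condition represented on the tapes.
conditions = list(product(compression_levels, frame_rates))

# Assumption: subjects were divided equally among the compression groups.
subjects_per_group = n_subjects // len(compression_levels)

print(len(conditions))        # 20
print(subjects_per_group)     # 12
```

Each subject would then see only the five frame-rate conditions belonging to one
compression level, which is what nesting subjects within compression means here.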

d.  Subjects.  Subjects  were  48  paid  volunteers  ranging  in  age  from  17  to 
38  years,  with  a  mean  age  of  28  years.  None  had  prior  experience  with  an  RPV  system 
or  with  military  target  detection  in  general.  All  subjects  were  examined  by  an 
optometrist  prior  to  the  study  to  ensure  that  no  visual  anomalies  existed  that  would 
adversely affect the study. All subjects had visual acuity of at least 20/30, uncorrected
or, if necessary, with glasses.

e.  Apparatus.  The  equipment  was  arranged  so  that  four  observers  could 
be tested simultaneously. Each observer performed in a cubicle that was visually
isolated from the other three observers and from the experimenter.

An observer station consisted of a 15-inch, black-and-white video monitor (Ball
Miratel Model BH-I5) on which the observer viewed the aerial reconnaissance scene and
searched for the target. The observer responded by means of a set of six switches
mounted on a single control box. The functions of the switches were as follows:
Number 1 was used to report “target detection”; Numbers 2 through 5 were used to report
recognition of the jeep, 5/4-ton truck, 2½-ton truck, and armored personnel carrier,
respectively. Number 6 switch was designated as a “reset” and was used by the observer
to cancel any responses made within a trial as, for example, to cancel an incorrect
“target detected” response when, upon closer inspection of the scene by the observer,
that response proved to be in error.

The video tapes were played on a Sony tape deck, Model TC-2000. The
signal  was  fed  to  the  four  subject  monitors  through  a  Panasonic  distribution  amplifier 
(Model  WJ-300).  Contrast  levels  of  the  four  monitors  were  equalized  within  the  limits 
of  their  available  range  and  with  reference  to  a  Spectra-Pritchard  Model  1980 
photometer. 

Presentation of the taped material and recording of observers’ responses were
accomplished by a Hewlett-Packard calculator (Model 9825A). To start a trial (tape
presentation), the experimenter pressed a button on the calculator. During the trial,
observer responses were recorded separately (contingent upon the “reset” function as
described above). At the end of each trial, upon command of the experimenter, the data
were transferred to a magnetic disc (Hewlett-Packard Model 9885M).


4  R.  E.  Kirk,  Experimental  Design:  Procedures  for  the  Behavioral  Sciences.  Wadsworth,  Belmont,  CA.  (1968). 

f. Procedure. Subjects were assigned at random to experimental groups. The first part
of the day was devoted to orientation and practice. Formal data collection then
followed, with all tests on any observer being completed within a single day.

g. Orientation and Practice. Observers were tested in groups of four. Prior to
conducting the formal data collection trials, the observers were told the purpose of
the research and what their task would be. They were shown three-dimensional scale
models of the targets that they would be searching for and they were shown three
sample runs (not included in the formal data trials) on the monitor. Finally, each
observer was given 30 practice trials (analog tape only; no image compression or frame
rate reduction) during which the experimenter verified that everyone understood the
task and the correct use of the response switches. At the end of each training trial
the experimenter pointed out and identified the target. In the training trials and in
later test trials there was always one, and only one, target present. The instructional
and practice sessions lasted approximately 2½ hours.

h.  Data  Collection.  Each  observer  viewed  a  total  of  five  runs  distributed 
in equal numbers on five separate video tapes. As noted previously, there was a
different set of tapes for each (nested) group of observers. Each tape ran continuously for
35  minutes.  Observers  were  permitted  a  10-minute  rest  period  after  each  of  the  tapes. 
A  1-hour  lunch  break  was  given  at  the  conclusion  of  the  first  test  tape.  The  second 
through  the  fifth  were  run  after  the  lunch  period. 

For  each  group  of  observers,  the  first  tape  presented  was  always  at  a 
rate  of  30  frames  per  second.  The  order  of  the  remaining  tapes,  representing  the 
remaining  frame  rates,  was  randomized.  For  each  data  trial  (run),  an  observer  had  an 
opportunity  to  make  a  detection  response  and  a  recognition  response  by  pressing  the 
corresponding  switches  on  his  panel.  If  an  observer  pressed  his  reset  switch  at  any  time 
during  a  trial,  any  responses  made  previously  during  that  trial  were  automatically 
deleted. A trial was considered valid and data were recorded only if a target was
recognized correctly.

If an observer reported “detection” but no subsequent “recognition,” or if the
subsequent recognition was in error, the trial was treated as if the observer had made
no response. On each trial, however, the target in the video scene was always
approached closely enough to ensure ultimate recognition of the target. Whether or not
recognition was simultaneous with detection from the observer’s standpoint, the
observer was required to report a detection before reporting the target name (jeep,
APC, etc.). Observer responses were recorded as elapsed time commencing with the
start of each trial. Given the measure of elapsed time plus the speed and altitude of
the photographic aircraft, it was possible to calculate the respective detection and
recognition ranges.



III. RESULTS


Observer  responses  were  recorded  as  the  time  elapsing  between  the  start  of  a  test 
run and the report of target detection (and, subsequently, target recognition). Since
the  length  of  the  original  flight  segment  of  the  camera-carrying  aircraft  was  known, 
along  with  the  aircraft  speed  and  altitude,  it  was  possible  to  convert  each  elapsed  time 
measure  into  the  range  to  target  at  the  point  of  detection  (or  recognition). 
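
The conversion from elapsed time to range can be sketched as below. The run length and
ground speed are placeholder values, since the source reports neither; only the
1,500-foot altitude comes from the text. The source also does not say whether "range"
means ground range or slant range, so both are computed and the function names are
hypothetical.

```python
import math

FEET_TO_METERS = 0.3048


def ground_range_m(elapsed_s, run_length_m, ground_speed_mps):
    """Horizontal distance still to fly to the target at the moment of response."""
    remaining = run_length_m - ground_speed_mps * elapsed_s
    if remaining < 0:
        raise ValueError("response occurred after the aircraft passed the target")
    return remaining


def slant_range_m(ground_m, altitude_m):
    """Line-of-sight distance from the aircraft to the target."""
    return math.hypot(ground_m, altitude_m)


altitude = 1500 * FEET_TO_METERS   # altitude stated in the report, in meters
run_length = 4000.0                # assumed run length in meters (not in the source)
speed = 50.0                       # assumed ground speed in m/s (not in the source)

ground = ground_range_m(40.0, run_length, speed)
slant = slant_range_m(ground, altitude)
print(ground)            # 2000.0
print(round(slant))      # 2052
```

Because the aircraft flew straight and level at constant speed, the conversion is
linear in elapsed time, so later responses always correspond to shorter ranges.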

a.  Summary.  The  data  concerning  target  detection  were  of  primary 
importance.  Since  each  detection  response  was  followed  by  a  correct  recognition  of 
the  target  (always  at  a  closer  range,  since  a  detection  report  had  to  be  made  before  a 
recognition report would be accepted), the detection data reported here represent
100-percent accuracy of subsequent target recognition. Results are presented primarily for
the  detection  measures  since  the  recognition  data  added  little  more  of  consequence. 

In  brief,  the  wide-angle-view  data  were  associated  with  such  short  target 
detection  ranges  as  to  be  of  no  further  use  in  this  study  from  a  military  operational 
standpoint. Consequently, after initial inspection of the data, that variable was
dropped  from  the  analysis. 

In  general,  a  shorter  average  detection  range  was  associated  with  the 
smaller  target  (jeep);  targets  (size)  did  interact  with  the  spatial  compression  variable. 

Image  spatial  reduction  was  associated  with  a  reduction  in  mean  target 
detection  range.  The  variable  of  frame  rate  reduction,  however,  had  no  noticeable 
effect  on  target  detection  range,  nor  did  this  variable  interact  with  the  image  spatial 
compression  variable. 

b.  Angle  of  View.  The  variable  of  camera  viewing  angle  was  included  in 
the  study  primarily  as  an  operational  issue.  Observer  performance  with  respect  to 
field-of-view  was  found  to  be  significantly  poorer  with  the  wide-angle  view  than  with 
the  narrow  field-of-view.  Figure  2  presents  mean  detection  range  and  mean  recognition 
range  for  each  field-of-view.  The  mean  detection  range  for  the  narrow  view  (6.5°)  was 
1991 meters compared to 1052 meters for the wide view (16°). The mean recognition
range was 625 meters for the narrow view and 146 meters for the wide view. As may
be seen in the figure, targets were detected at roughly twice the distance under the
narrow field-of-view condition as with the wide field-of-view. Moreover, the mean
recognition range with the narrow field-of-view was more than four times greater than
with the wide field. On the basis of the poor performance with the wide field-of-view,
the data for that condition were excluded from further consideration in the study.
The  remaining  results  and  discussion,  therefore,  are  restricted  to  data  obtained  under 
the  narrow  field-of-view  condition. 

c. Type of Target. The targets comprised four types of military vehicles. In order of
increasing size, they were: jeep, 5/4-ton truck, 2½-ton truck, and armored personnel
carrier (APC).

Figure 3 presents the mean detection range for the respective target vehicles. The
mean detection ranges in meters were as follows: jeep, 1359; 5/4-ton truck, 2281;
2½-ton truck, 2214; and APC, 2609. As might be expected, the jeep, which was the
smallest of the four targets, was associated with the shortest detection range. The
armored personnel carrier, the largest of the four vehicles, was detected at the
greatest range. On the average, the APC was detected at about twice the distance of
the jeep. The longer detection range for the APC, however, was found to be due in part
to a characteristic unique to that target in the video tapes used. Sunlight reflecting
from the APC fording board and/or headlights produced a conspicuous glint pattern
which artifactually enhanced the detectability of the vehicle in the context of the
present study. Thus, while the APC served the study as an alternative target and
thereby contributed to the realism of the search task, the detection range data
associated with it were considered not to be compatible with those obtained for the
other targets.

d. Spatial Compression. A question of major interest in this study was the
overall  effect  of  image  spatial  compression  and  whether  the  effect  of  such  compression 
would  interact  with  a  further  reduction  in  image  fidelity  associated  with  reduced  frame 
rates. 


Figure  4  presents  the  overall  mean  detection  range  for  each  level  of 
spatial  compression.  The  means  are  based  on  all  4  target  vehicles  and  all  48  observers. 
Data are for the narrow (6.5°) field-of-view only. The mean target detection ranges
were: 1340 meters for 0.5 bit; 1880 meters for 2 bits; 2380 meters for 8 bits; and 2375
meters for analog conditions. As may be seen in the figure, a reduction of image
information from analog to 8 bits per picture element resulted in essentially no
difference in mean target detection range. When the image was compressed from 8 bits
to 2 bits, the mean detection range fell by roughly 20 percent, to 1880 meters. At
0.5 bit per picture element, the mean detection range was 1340 meters, representing a
loss of about 44 percent relative to both the 8-bit and analog conditions.
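The percentage losses quoted above follow directly from the reported means. As a minimal sketch (the dictionary and function names are illustrative and do not come from the study), the arithmetic can be checked as follows:

```python
# Mean detection ranges (meters) reported in the text, narrow field-of-view only.
mean_range_m = {"0.5 bit": 1340, "2 bits": 1880, "8 bits": 2380, "analog": 2375}


def percent_loss(reference_m, degraded_m):
    """Percent reduction in detection range relative to a reference condition."""
    return 100.0 * (reference_m - degraded_m) / reference_m


# Loss when compressing from 8 bits to 2 bits per picture element.
loss_8_to_2 = percent_loss(mean_range_m["8 bits"], mean_range_m["2 bits"])

# Loss at 0.5 bit per picture element, against each of the two baselines.
loss_8_to_half = percent_loss(mean_range_m["8 bits"], mean_range_m["0.5 bit"])
loss_analog_to_half = percent_loss(mean_range_m["analog"], mean_range_m["0.5 bit"])

print(f"8 bits -> 2 bits:  {loss_8_to_2:.1f} percent")
print(f"8 bits -> 0.5 bit: {loss_8_to_half:.1f} percent")
print(f"analog -> 0.5 bit: {loss_analog_to_half:.1f} percent")
```

Because the 8-bit and analog means differ by only 5 meters, the computed loss at 0.5 bit is nearly the same against either baseline.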




e.  Frame  Rate.  The  effect  of  image  temporal  compression  (i.e.,  reduced 
frame  rate)  on  target  detection  range  was  evaluated  at  five  levels;  these  were  1,  3,  6, 
10,  and  30  frames  per  second  (FPS).  The  mean  detection  ranges  in  meters  were: 
1945 for 1 FPS; 2030 for 3 FPS; 2018 for 6 FPS; 1920 for 10 FPS; and 2060 for 30
FPS.  The  overall  means  are  presented  in  Figure  5.  As  may  be  seen  in  the  figure,  there 
was  no  appreciable  difference  in  mean  target  detection  range  as  a  function  of  frame 
rate.  The  range  of  the  means  was  from  2060  meters  for  30  frames  per  second  to  1920 
meters  for  10  frames  per  second.  The  slowest  frame  rate,  one  frame  per  second,  was 
associated  with  a  mean  target  detection  range  of  1945  meters. 
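To put the size of the frame-rate differences in perspective, the spread of the five means can be compared with their average. The following is only a descriptive sketch using the means reported above (the variable names are illustrative); it says nothing about statistical significance, which depends on variability data not reproduced here.

```python
# Mean detection ranges (meters) by frame rate (frames per second), from the text.
mean_range_by_fps = {1: 1945, 3: 2030, 6: 2018, 10: 1920, 30: 2060}

values = list(mean_range_by_fps.values())
grand_mean_m = sum(values) / len(values)      # average of the five means
spread_m = max(values) - min(values)          # largest minus smallest mean
spread_pct = 100.0 * spread_m / grand_mean_m  # spread as a percent of the average

print(f"grand mean: {grand_mean_m:.1f} m")
print(f"spread:     {spread_m} m ({spread_pct:.1f} percent of the grand mean)")
```

The means do not rise or fall steadily with frame rate: the lowest mean occurs at 10 FPS, not at 1 FPS.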

f.  Image  Spatial  Compression  and  Frame  Rate.  Figure  6  presents  mean 
target  detection  range  as  a  function  of  image  spatial  compression  (bits  per  picture  ele¬ 
ment)  for  each  frame  rate  used  in  the  study.  Intervals  of  spatial  reduction  are  based 
on  powers  of  2.  That  is,  each  successive  level  of  reduction  would  contain  half  of  the 
bits per picture element of the preceding level. Therefore, the abscissa of Figure 6
presents log2 bits per picture element. The actual number of bits is shown in
parentheses beneath its respective log2 value.

The overall interaction effect was not statistically significant. In
essence, mean target detection range varied approximately linearly with log2 bits per
picture element over the range of 0.5 to 8 bits (log2 = -1 to 3). There was no
significant difference in mean target detection range between 8 bits per picture
element and analog (plotted at log2 = 3 and 5, respectively). Moreover, this
relationship was independent of frame rate.
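The roughly linear relationship with log2 bits per picture element can be illustrated with the overall means reported under Spatial Compression (1340, 1880, and 2380 meters at 0.5, 2, and 8 bits). The sketch below fits an ordinary least-squares line to those three points. It is not the analysis performed in the study, the variable names are illustrative, and three points cannot by themselves establish linearity. The analog condition is left out because its position on the log2 axis is a plotting convention.

```python
import math

# Overall mean detection ranges (meters) from the text, by bits per picture element.
bits_per_pixel = [0.5, 2.0, 8.0]
mean_range_m = [1340.0, 1880.0, 2380.0]

# Abscissa used in Figure 6: log2 of bits per picture element (-1, 1, 3).
x = [math.log2(b) for b in bits_per_pixel]

# Ordinary least-squares fit of range = intercept + slope * log2(bits).
n = len(x)
x_bar = sum(x) / n
y_bar = sum(mean_range_m) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, mean_range_m))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Departure of each reported mean from the fitted line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, mean_range_m)]

print(f"slope:     {slope:.1f} m per unit of log2 bits")
print(f"intercept: {intercept:.1f} m at 1 bit per picture element")
print("residuals:", [round(r, 1) for r in residuals])
```

On these three means the fitted slope is about 260 meters per doubling of bits per picture element, and each mean lies within about 15 meters of the line.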


g.  Target  Vehicles  and  Spatial  Compression.  It  was  of  interest  to  compare 
the results obtained with the smallest target (jeep) with those of a large target
(2½-ton truck) to see whether there was any effect due to target size or any target
size interaction effects. Figure 7 presents mean target detection range as a function
of log2 bits per picture element for the truck and the jeep.

As  may  be  seen  in  the  figure,  mean  target  detection  range  decreases 
approximately  linearly  when  plotted  against  log2  bits  per  picture  element.  The  loss 
in  detection  range,  however,  was  proportionately  greater  for  the  jeep  at  the  lower  levels 
of  bits  per  picture  element.  That  is,  in  comparing  observer  performance  of  analog 
video  vs  0.5  bit  per  picture  element,  the  mean  range  for  detection  of  the  truck  was 
reduced  (degraded)  by  36  percent  as  compared  to  56  percent  loss  for  the  jeep.  Thus, 
for  these  two  targets  the  interaction  of  target  type  with  spatial  compression  was 
statistically  significant. 
IV.  DISCUSSION


The effects of image spatial compression, temporal compression, and the interaction
of these two variables in relation to visual target detection were the main interest
of this study. A variety of targets was used, varying principally in size. Initially,
the study design included video tapes taken with either of two fields-of-view (6.5°
and 16°).

Observer  target  detection  performance  with  the  wider  angle  lens  was  so  poor  as  to 
be  beyond  the  range  of  consideration  in  the  remotely  piloted  vehicle  system.  All  data 
of  any  consequence,  therefore,  were  based  on  the  narrow  field-of-view  condition  and 
have  been  reported  accordingly. 

No significant interaction was found between spatial compression and frame rate
(temporal compression), nor did frame rate by itself have any appreciable effect on
observer performance. Mean target detection range, however, did decrease approximately
linearly with log2 of bits per picture element, and this effect was more pronounced
for the smaller (jeep) target than for the larger (2½-ton truck) target. It is worthy
of note, moreover, that a reduction in bits per picture element from analog to 8 bits
produced no loss in mean detection range.

Reduction  in  detection  range  due  to  spatial  compression  may  be  explained  in  that 
the  primary  visual  cues  associated  with  these  military  target  vehicles  appeared  to  be 
contrast  in  shadow  pattern  and,  possibly,  shape.  On  the  analog  (uncompressed)  tapes, 
the  vehicles  had  a  more  geometric  or  angular  appearance  than  did  the  competing  ele¬ 
ments  of  the  background  comprised  mainly  of  shrubbery  and  trees.  As  the  level  of 
spatial compression was increased (bits per picture element were reduced), the targets
and the surrounding shrubbery became visually similar; hence, the observers did not
report detection until they had a closer view of the scene.

As  for  the  absence  of  the  effect  on  observer  performance  due  to  frame  rate  reduc¬ 
tion,  the  result  is  not  surprising  when  one  considers  that  the  targets  were  stationary. 
Even  though  the  video  camera  was  moving  when  the  tapes  were  made,  no  relative 
velocity  information  was  contributed  by  the  targets.  Therefore,  to  show  an  observer 
(for  example)  six  frames  in  1  second,  each  for  a  duration  of  1/6  second,  or  to  show 
him three frames each at a duration of 2/6 second would have little bearing on
detection range. In fact, the slower frame rates (e.g., 3 per second or 1 per second)
provide a more stable view in which to study and compare contrast patterns on the
screen. In
selecting  an  optimal  frame  rate,  however,  consideration  should  be  given  to  the  obser¬ 
ver’s  subjective  feelings  of  comfort  (or  fatigue)  under  prolonged  viewing.  This  matter 
was  not  formally  evaluated  in  the  present  study,  but  there  were  complaints  from  the 
observers that the three-frames-per-second condition was annoying and uncomfortable.
Use of lower frame rates, such as one frame per 2 seconds, for example, would have to
consider the speed of the camera vehicle and, correspondingly, the detection range
forfeited in viewing a single frame for 2 seconds or more before an update is provided.
That  question  might  be  better  addressed  by  measuring  time  to  detect  a  target  on  a 
given  frame  taken  at  a  given  range.  This  would  give  the  probability  of  target  detection 
at  a  given  range  and  frame  rate  when  in  the  dynamic  presentation  mode. 
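The range forfeited while a single frame is held can be approximated as the distance the camera vehicle covers between updates, that is, its speed divided by the frame rate. The sketch below assumes the vehicle flies directly toward the target. The speed used is a hypothetical value chosen only for illustration; no vehicle speed is given in this part of the report.

```python
def range_closed_per_frame_m(speed_m_per_s, frames_per_second):
    """Distance (meters) the camera vehicle travels while one frame is displayed."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return speed_m_per_s / frames_per_second


# Hypothetical closing speed, assumed for illustration only (not from the study).
HYPOTHETICAL_SPEED_M_PER_S = 50.0

# Frame rates used in the study, plus one frame per 2 seconds (0.5 FPS).
for fps in (30, 10, 6, 3, 1, 0.5):
    closed_m = range_closed_per_frame_m(HYPOTHETICAL_SPEED_M_PER_S, fps)
    print(f"{fps:>4} FPS: {closed_m:6.1f} m closed per frame")
```

Under this assumed speed, holding one frame for 2 seconds would forfeit 100 meters of range before the next update, compared with 50 meters at one frame per second.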

V.  CONCLUSIONS 

In conclusion, the results of this investigation suggest that, for stationary military
targets viewed by means of a remotely piloted vehicle, a reduction in frame rate from
30 frames to 1 frame per second does not adversely affect target detection range, and
that the effect of spatial compression on detection range does not depend upon the
accompanying rate of frame presentation. Finally, smaller targets are detected, on the
average, at a shorter range than are larger targets, and this difference is greater at
higher levels of image spatial compression.